Combination of SPLICE and Feature Normalization for Noise Robust Speech Recognition

Authors

  • Tsunenobu Kai
  • Masayuki Suzuki
  • Keigo Chijiiwa
  • Nobuaki Minematsu
  • Keikichi Hirose
Abstract

It is well known that the performance of automatic speech recognition (ASR) systems is easily affected by acoustic mismatch between training and testing conditions. This mismatch is often caused by various kinds of environmental noise or distortion. To reduce the effect of this mismatch, feature normalization, feature enhancement, model adaptation, and related techniques have been studied intensively. Cepstral mean normalization (CMN), mean and variance normalization (MVN), and histogram equalization (HEQ) are well-known methods of feature normalization. Stereo-based piecewise linear compensation for environments (SPLICE) is one of the feature enhancement methods. In this paper, we describe how to combine these methods to effectively improve the robustness of ASR systems. In experiments performed on the Aurora-2 database, a good combination showed a 41% improvement in word error rate over SPLICE only, and a 25% improvement over the conventional combination of SPLICE and CMN.
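As a rough illustration of the two simplest normalization methods named in the abstract, the following is a minimal sketch of per-utterance CMN and MVN. It is not the authors' implementation: the function names `cmn` and `mvn`, the `eps` guard, and the toy feature values are all assumptions made for this example, and HEQ and SPLICE are not covered.

```python
import math

def cmn(frames):
    """Cepstral mean normalization (sketch): subtract the
    per-utterance mean of each coefficient from every frame."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    return [[f[d] - means[d] for d in range(dims)] for f in frames]

def mvn(frames, eps=1e-8):
    """Mean and variance normalization (sketch): apply CMN, then
    scale each coefficient to approximately unit variance.
    eps is an assumed guard against division by zero."""
    centered = cmn(frames)
    n = len(centered)
    dims = len(centered[0])
    stds = [math.sqrt(sum(f[d] ** 2 for f in centered) / n)
            for d in range(dims)]
    return [[f[d] / (stds[d] + eps) for d in range(dims)]
            for f in centered]

# Toy "utterance": 4 frames of 2 cepstral coefficients (made-up numbers).
utterance = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0], [4.0, 16.0]]
cmn_out = cmn(utterance)
mvn_out = mvn(utterance)
```

Both functions compute statistics over a whole utterance; a real front end might instead use a sliding window or running estimates, which this sketch does not attempt.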

Similar resources

Improving the performance of MFCC for Persian robust speech recognition

The Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in automatic speech recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...

A new method for robust speech recognition based on missing data using a bidirectional neural network

The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing-feature method. In this approach, the components of the time-frequency representation of the signal (spectrogram) that present a low signal-to-noise ratio (SNR) are tagged as missing, deleted, and then replaced by the remaining components and statistical ...

A recursive feature vector normalization approach for robust speech recognition in noise

The acoustic mismatch between testing and training conditions is known to severely degrade the performance of speech recognition systems. Segmental feature vector normalization [8] was found to improve the noise robustness of MFCC feature vectors and to outperform other state-of-the-art noise compensation techniques in speaker-dependent recognition. The objective of feature vector normalization...

Feature and distribution normalization schemes for statistical mismatch reduction in reverberant speech recognition

Reverberant noise has been a major concern in speech recognition systems. Many speech recognition systems, even with state-of-the-art features, fail to cope with reverberant effects, and the recognition rate deteriorates. This paper explores the significance of normalization strategies in reducing statistical mismatches for robust speech recognition in reverberant environments. Most normalization wo...

The dependence of feature vectors under adverse noise

The performance degradation of automatic speech recognition systems due to acoustic mismatch between training and testing environments is a severe problem for the practical use of speech recognizers [1]. In this paper, we explore the effects of noise on individual speech feature vector statistics, and several feature normalization methods are used to compensate for environmental influence on feature vectors. We ...

Publication date: 2012